ABSTRACT

X-ray crystallography provides excellent struc- 
tural data on protein-DNA interfaces, but crystallo- 
graphic complexes typically contain only small 
fragments of large DNA molecules. We present a 
new approach that can use longer DNA substrates 
and reveal new protein-DNA interactions even in ex- 
tensively studied systems. Our approach combines 
rigid-body computational docking with hydrogen/ 
deuterium exchange mass spectrometry (DXMS). 
DXMS identifies solvent-exposed protein surfaces; 
docking is used to create a 3-dimensional model 
of the protein-DNA interaction. We investigated 
the enzyme uracil-DNA glycosylase (UNG), which 
detects and cleaves uracil from DNA. UNG was 
incubated with a 30 bp DNA fragment containing a 
single uracil, giving the complex with the abasic 
DNA product. Compared with free UNG, the UNG- 
DNA complex showed increased solvent protection 
at the UNG active site and at two regions outside 
the active site: residues 210-220 and 251-264. 
Computational docking also identified these two 
DNA-binding surfaces, but neither shows DNA 
contact in UNG-DNA crystallographic structures. 
Our results can be explained by separation of 
the two DNA strands on one side of the active 
site. These non-sequence-specific DNA-binding 
surfaces may aid local uracil search, contribute to 
binding the abasic DNA product and help present
the DNA product to APE-1, the next enzyme on the
DNA-repair pathway.

INTRODUCTION 

The determination of protein-DNA interactions can 
be instrumental for understanding function, for designing 
experiments to probe biological mechanisms and for 
developing new drugs. X-ray crystallography provides 
high-resolution structures of protein-DNA complexes, 
but these complexes are difficult to crystallize. When 
high-quality crystals of protein-DNA complexes are 
obtained, they typically contain only small DNA frag- 
ments due to constraints imposed by crystal packing (1). 
Therefore, there is a need for new methods that examine 
the complete protein-DNA interaction. Here, we use an 
innovative experimental/theoretical approach that combines
rigid-body macromolecular docking with hydrogen/deuterium
exchange mass spectrometry (DXMS).
Computational docking guides the design of the DXMS 
experiment, which identifies exposed protein surfaces in a 
solution environment. Docking is then used to interpret 
the experimentally determined DNA footprint, producing 
a 3-dimensional model of the protein-DNA interaction. 

Despite significant progress in applying macromolecular 
docking methods to protein-protein complexes (2-4), the 
prediction of protein-DNA interactions remains a largely 
unaddressed challenge (5). So far, there have been only a 
few applications of macromolecular docking methods 
to the prediction of protein-DNA complexes (6-14). We 
developed the global, systematic search program DOT 
(15,16), in which interaction energies are calculated as
the sum of electrostatic and van der Waals components.
Tests on transcription factor proteins demonstrated that 
rigid-body docking with DOT successfully identified 
dsDNA-binding sites and the orientation of the DNA at 
those sites (9). Furthermore, the ensembles of favorable 
DNA placements indicated the degree of bending of 
bound DNA over the protein surface. 

DXMS has proved a powerful method for studying 
protein interactions. In DXMS, hydrogen/deuterium 
exchange is followed by protein proteolysis and character- 
ization of the resulting peptides by mass spectrometry, 
revealing the degree of solvent exposure for backbone 
amide hydrogen atoms throughout the protein chain. By 
examining the change in solvent exposure between the 
unbound and DNA-bound protein, DXMS has the poten- 
tial to reveal the DNA footprint on the protein surface 
(17,18). 

We applied our combined computational docking/ 
DXMS approach to the essential DNA-repair enzyme 
uracil-DNA glycosylase (UNG), which cleaves uracil 
from ssDNA and dsDNA by hydrolysis of the 
N-glycosylic bond between uracil and the deoxyribose. 
Extensive studies of UNG (19,20) include examination 
of its sequence specificity (21,22), its interactions with 
undamaged DNA (23-25) and with other proteins 
involved in DNA binding and repair (26-28), and its 
search mechanism (20,29,30). Crystallographic structures 
of the catalytic domain of human UNG bound to a 10 
base pair abasic DNA product (26,31) and to dsDNA 
analogs (24,32) show that uracil and its associated sugar 
are rotated approximately 180° out of the base stack. The UNG active
site consists of a deep uracil-binding pocket with an 
overlying groove that binds one DNA strand (see 
Supplementary Figure S1A). The Leu 272 side chain
inserts through the DNA minor groove to replace the 
flipped-out uracil nucleotide. Mutagenesis studies have 
identified active-site residues involved in catalysis and spe- 
cificity of the uracil-binding pocket (31,33,34) and support 
the proposed role of residue Leu 272 (31). 

Our goals were to evaluate the ability of our combined 
approach to identify the known UNG active site and to 
probe the full extent of the UNG-DNA contact surface. 
The solution DXMS studies allowed use of a 30 bp DNA 
fragment, considerably longer than the 10 bp DNA used 
in the X-ray crystallographic studies. The exhaustive 
search performed by the computational docking algorithm 
explored the entire surface of UNG. Together, these two 
techniques provide strong evidence that the DNA-binding 
surface on UNG extends considerably beyond the imme- 
diate active site. 



MATERIALS AND METHODS 

Protein and DNA preparation for DXMS 

The full catalytic domain of human UNG (21) was 
expressed and purified. In this UNG construct, the 85 
N-terminal residues were replaced by a 22 amino acid 
His tag (MGSSHHHHHHSSGLVPRGSHMG). The 
final UNG stock solution had a protein concentration 
of 9.9 mg/ml (0.36 mM) in a buffer of 10 mM Tris,
10 mM NaCl, 1 mM DTT, pH 7.5. Oligonucleotides were
obtained from IDT (Integrated DNA Technologies). 
One strand contained deoxy-U:
5'-ctgtuatcttgatcgatcgatcgatcgatc-3'. The other strand was biotinylated at
the 5'-end: 5'-biotin-gatcgatcgatcgatcgatcaagatgacag-3'. 
The two oligos were annealed to give 30 bp dsDNA 
with a U:G mismatch pair. The DNA stock solution 
had a DNA concentration of 56 mg/ml (2.97 mM) in
10 mM NaCl, 10 mM Tris, pH 7.5.

Characterization of the UNG-DNA complex 

We verified that a 1:1 complex of UNG with product 
DNA was formed under the conditions and protein and 
DNA concentrations used in the DXMS experiments. 
Activity assays were performed with [³H]dUMP-containing
DNA substrate in the presence or absence of
the same amount of non-labeled DNA used in the DXMS 
experiments. The reactions (21 µl) contained 1.8 nmol
UNG in 3.4 mM Tris, pH 7.4, 10 mM NaCl, 0.28 mM 
DTT and 1.8 µM [³H]dUMP-containing calf thymus
DNA (sp. act. 0.5 mCi/mmol). Prior to addition of
UNG, 56 µg of non-labeled calf thymus DNA was
added to half of the reaction mixtures. After incubation
for 90 min at room temperature, the amount of uracil
released was measured as described (35). Notably, 
all DNA uracil was released from the substrate both 
in the presence and absence of high concentrations of 
non-labeled DNA (data not shown), demonstrating that 
UNG was active under the assay conditions. 

The solution state of the UNG-DNA complex was 
assessed using multiangle light scattering (MALS) mass 
measurements (Supplementary Methods). UNG-DNA 
complexes were examined at 1:1 and 1:1.6 stoichiometric 
ratios. UNG at a concentration of 5 mg/ml (in 10 mM
NaCl, 10 mM Tris, pH 7.5) was pre-incubated with
either 1 or 1.6 equivalents of the 30 bp dsDNA at room
temperature for 90 min. Notably, the two stoichiometric
ratios gave similar results: a single elution peak with light 
scattering masses of 35.39 (±3%) and 38.97 (±4%) kDa 
for the 1:1 and 1:1.6 ratios, respectively. The DNA has a 
mass of 18.8 kDa, consistent with a 1:1 complex of UNG 
and DNA (36-38) at both a 1:1 ratio and the 1:1.6 ratio 
used in the DXMS experiments. 

Establishing optimal proteolysis conditions for DXMS 

We first determined the concentrations of the denaturant 
guanidine hydrochloride that gave overlapping peptides 
spanning the full UNG sequence. Peptides should be 
long enough to be uniquely identified but short enough 
to localize changes in solvent protection. Samples of the 
UNG stock solution (5 µl, 1.8 nmol) were diluted with
15 µl of 1.7 mM Tris (pH 7.1), 10 mM NaCl, and then
mixed with 30 µl of quench solution [0.08 M, 0.8 M, 1.6
M, 3.2 M or 6.4 M guanidine hydrochloride in 0.8% (v/v) 
formic acid, 16.6% (v/v) glycerol] on ice. The UNG 
samples were then subjected to proteolysis, and the result- 
ing peptides were separated and analyzed by mass spec- 
trometry (39) (details in Supplementary Methods). The 
resulting fragmentation maps revealed that both 0.8 M
and 3.2 M guanidine hydrochloride (0.5 M and 2.0 M final
concentration) gave the best results.

DXMS 

After establishing good fragmentation maps, UNG and 
the UNG-DNA complex were subjected to hydrogen/ 
deuterium exchange experiments followed by mass spec- 
trometry analysis to determine the degree of deuteration 
of UNG backbone amide hydrogen atoms. To prepare 
samples of the UNG-DNA complex, 5 µl of the UNG
stock solution (1.8 nmol) was incubated with 1.0 µl
(3.0 nmol) of the dsDNA stock solution (ratio of 1:1.6
UNG:DNA) at room temperature for 90 min and then
cooled to 0°C. Deuterium oxide (D2O) buffer at 0°C
[15 µl, 1.7 mM Tris, 10 mM NaCl, pD (read) 7.1] was
added to 5 µl samples (1.8 nmol) of UNG and to the
prepared UNG-DNA samples. Samples were incubated
for 10, 30, 100, 300, 1000, 3000 and 10 000 s at 0°C and
for 1000, 3000, 10 000 and 30 000 s at room temperature.
Hydrogen exchange rates are about 10 times faster at
room temperature (40), so these experiments are equivalent
to 10 000, 30 000, 100 000 and 300 000 s at 0°C. The
data shown in Figure 2 and Supplementary Tables S1-S3
are given in terms of the deuteration times at 0°C.
Samples were quenched with 30 µl of 0.5 M (final concentration)
guanidine hydrochloride solution for UNG or
with 30 µl of 2.0 M (final concentration) guanidine hydrochloride
solution for the UNG-DNA complex, and then
proteolyzed and analyzed by mass spectrometry (Supplementary
Methods). Non-deuterated and fully deuterated
samples were analyzed for comparison. UNG (5 µl)
was mixed with 15 µl of 1.7 mM Tris (pH 7.1), 10 mM
NaCl on ice (non-deuterated sample) or with 15 µl 0.5%
formic acid in D2O overnight at room temperature
(fully deuterated sample). To check for consistency and 
correct peptide identification, we examined all overlapping 
peptides within each data set. Each peptide typically was 
present in multiple charge states, with each identified 
and analyzed independently, providing a further check 
of consistency. 
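
The sample amounts and the time conversion described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the stock concentrations, volumes and the approximate tenfold rate factor (40) stated in this section, and the function and variable names are our own assumptions rather than part of any analysis software used in the study.

```python
# Stock concentrations from the text (1 mM is numerically 1 nmol per µl).
UNG_STOCK_MM = 0.36
DNA_STOCK_MM = 2.97
RATE_FACTOR_RT_VS_0C = 10  # approximate factor stated in the text (40)


def nmol(conc_mm, volume_ul):
    """Amount in nmol for a concentration in mM and a volume in µl."""
    return conc_mm * volume_ul


def equivalent_seconds_at_0c(seconds_at_rt, factor=RATE_FACTOR_RT_VS_0C):
    """Convert a room-temperature exchange time to its 0°C equivalent."""
    return seconds_at_rt * factor


ung = nmol(UNG_STOCK_MM, 5.0)  # about 1.8 nmol
dna = nmol(DNA_STOCK_MM, 1.0)  # 2.97 nmol, quoted as 3.0 nmol
print(f"UNG {ung:.2f} nmol, DNA {dna:.2f} nmol, DNA:UNG = {dna / ung:.2f}")

times_0c = [10, 30, 100, 300, 1000, 3000, 10000]
times_rt = [1000, 3000, 10000, 30000]
# Merge both series on the 0°C time scale; 10 000 s appears in both.
all_times = sorted(set(times_0c) | {equivalent_seconds_at_0c(t) for t in times_rt})
print(all_times)
```

On the common 0°C scale the two series merge into 10 distinct time points spanning 10 to 300 000 s, because the 10 000 s point at 0°C coincides with the 1000 s point at room temperature.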

Docking procedure and analysis 

Coordinates of unbound UNG (PDB code 1AKZ, resolution
1.57 Å) (33) and of UNG bound to a 10 bp DNA
fragment (1SSP, resolution 1.9 Å) (26) were obtained
from the Protein Data Bank (41). The human UNG con- 
struct in the crystallographic structures contains the full 
catalytic domain, with the N-terminal tail (84 residues) 
replaced by Met-Glu-Phe. In 1SSP, the DNA strand 
bound in the active site has the sequence 5'-ctgtuatctt-3', 
but the uracil base is cleaved. The complementary strand 
has A opposite the U and an additional 5' overhang of 
a single adenine. A linear 11 bp B-DNA model was
built with the Nucleic Acid Builder (NAB) program (42)
(see Supplementary Methods) with the sequences
5'-ctgtuatcttt-3' and 5'-aaagatgacag-3', creating a U:G
mismatch pair. Minimization with AMBER 8 using the 
generalized Born model (43,44) gave a wobble geometry 
with two hydrogen bonds for the U:G pair. 



Docking calculations were performed with the program 
DOT (15,16), which is part of the DOT2 Suite distributed 
by the Computational Center for Macromolecular Struc- 
ture at the San Diego Supercomputer Center (URL: 
http://www.sdsc.edu/CCMS). The DNA molecule, repre- 
sented by its atomic positions with partial atomic charges, 
was systematically moved within the shape and electro- 
static potentials calculated for the stationary UNG 
molecule. Potentials were calculated using utilities in the 
DOT2 Suite (Supplementary Methods), including use 
of the program REDUCE (45) to add hydrogen atoms, 
determine His side chain protonation states, and correct 
the geometry of Asn, Gln and His side chains; the
program MSMS (46) to calculate molecular surfaces that 
encompass the volumes defining the UNG shape poten- 
tial; the AMBER library of heavy atoms with added polar 
hydrogens (47) to assign partial atomic charges; and 
the program UHBD (48) to calculate the electrostatic 
potential of UNG by finite difference methods to solve 
the linearized Poisson-Boltzmann equation. Poisson- 
Boltzmann methods take into account the effects of dielec- 
tric, solvation and ionic strength on the electrostatic po- 
tential. The continuous electrostatic potential was then 
modified (9,16) to be compatible with the discontinuous 
shape potential. 
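
To make the idea of moving a charged molecule through precomputed potentials concrete, the following toy sketch scores a placement as the sum of an electrostatic term (partial charge times the potential at the nearest grid point) and a shape term read from a second grid. This is not the DOT implementation: the nearest-grid-point lookup, the dictionary-based grids, the numerical values and all names are assumptions made for illustration only.

```python
def score_placement(atoms, electrostatic_grid, shape_grid, spacing=1.0):
    """Sum electrostatic and shape terms for one placement of the moving molecule.

    atoms: list of (x, y, z, partial_charge) tuples.
    electrostatic_grid, shape_grid: dicts keyed by integer grid indices;
    grid points missing from a dict contribute zero.
    """
    e_elec = 0.0
    e_shape = 0.0
    for x, y, z, q in atoms:
        key = (round(x / spacing), round(y / spacing), round(z / spacing))
        e_elec += q * electrostatic_grid.get(key, 0.0)
        e_shape += shape_grid.get(key, 0.0)
    return e_elec + e_shape


def translate(atoms, dx, dy, dz):
    """Rigidly shift every atom by (dx, dy, dz)."""
    return [(x + dx, y + dy, z + dz, q) for x, y, z, q in atoms]


# Hypothetical toy potentials: a clash penalty at the origin and a
# favorable surface layer with positive electrostatic potential next to it.
shape = {(0, 0, 0): 1000.0, (1, 0, 0): -0.1}
elec = {(1, 0, 0): 2.0}

probe = [(0.0, 0.0, 0.0, -0.5)]  # a single negatively charged atom
shifts = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
ranked = sorted(shifts, key=lambda s: score_placement(translate(probe, *s), elec, shape))
best_score = score_placement(translate(probe, *ranked[0]), elec, shape)
print(ranked[0], round(best_score, 3))
```

In this toy case the negatively charged probe ranks best in the surface layer, where the potential is positive and there is no clash; a full search would repeat the scoring over every grid translation and every orientation and keep the most favorable placements.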

Docking calculations (Supplementary Methods) used a 
cubic grid 128 Å on a side with 1 Å grid spacing (about 2.1
million points). The DNA was centered at each grid point 
in 54000 distinct orientations, giving 108 billion place- 
ments of the DNA about UNG. The 2000 placements with 
the most favorable interaction energies, calculated as the 
sum of electrostatics and van der Waals intermolecular 
energy terms, were kept. These energies were mapped to 
the grid point at which the DNA was centered, allowing 
the distribution of the placements over the UNG to be 
visualized. The 30 top-ranked placements using coordin- 
ates from the UNG-DNA crystallographic complex were 
analyzed by calculating the rmsd between the docked 
DNA and the crystallographic DNA. Calculation of 
rmsd is a poor method for clustering B-DNA placements 
(6,9), because lack of sequence recognition results in shifts 
along the DNA axis by one or more base pairs within the 
same cluster. Instead, the 30 top-ranked placements and 
the distribution of the 2000 top-ranked placements over 
the UNG surface were analyzed with computer graphics 
(Figure 1C and Supplementary Figure S2A). 
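
As an illustration of the rmsd comparison described above, the sketch below computes the rmsd between two equally ordered sets of heavy-atom coordinates in a fixed frame, that is, without superimposing the DNA molecules, since the position of UNG is held fixed. The coordinates are made up for the example and the function name is an assumption.

```python
import math


def rmsd_fixed_frame(coords_a, coords_b):
    """RMSD between two equally ordered atom lists, with no superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates (in Å), not taken from 1SSP.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 1.0, 0.0)]
docked = [(x + 3.0, y, z) for x, y, z in crystal]  # rigid 3 Å shift along x
shift_rmsd = rmsd_fixed_frame(crystal, docked)
print(f"rmsd = {shift_rmsd:.2f} Å")
```

A rigid shift moves every atom by the same distance, so the rmsd equals the shift. This is why placements that differ only by sliding the DNA along its axis can show sizable rmsd values while occupying the same binding site.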



RESULTS 

We applied computational docking and DXMS to the 
interaction of the catalytic domain of human UNG 
with DNA. Using the macromolecular docking program 
DOT (15,16), the DNA was systematically translated and 
rotated around UNG, resulting in ≈108 billion placements
of the DNA, which were ranked by the sum of
electrostatic and van der Waals energies. The electrostatic 
energy term is an essential component of the ranking 
function for highly polar interactions. For protein-DNA 
complexes, DOT provides a good approximation of the
electrostatic energy calculated for the full complex by
Poisson-Boltzmann methods (9).

Figure 1. B-DNA docked to the DNA-bound structure of UNG (gray Cα backbone). (A) The 30 top-ranked B-DNA placements compared with
bound DNA from 1SSP (blue phosphate backbone): 21 (yellow) at the active site; 6 (green) tightly clustered at a secondary site; 1 (magenta) between
the active site and the secondary site; and 2 (orange) near the UNG N-terminus. (B) The larger active-site cluster (14 placements) replicates the
UNG-DNA active-site contacts found in the 1SSP complex, including insertion of Leu 272 (black) into the DNA minor groove. These dockings also
show direct contact of the complementary strand with residues 210-220 (magenta). In all, the active-site strand has the same 5' to 3' direction as the
crystallographic DNA, as indicated by red coloring of the 3'-ends. The UNG backbone is colored by the DXMS results (see Figure 3). (C) The 2000
top-ranked B-DNA placements, represented by their geometric centers (spheres), are concentrated over the active site (indicated by the crystallographic
DNA, blue, right), at the secondary site (indicated by docked B-DNA, green, left), and between the two sites.

Three computational dockings were done. First, UNG 
and DNA coordinates from the crystal complex (PDB 
code 1SSP) (26) were docked to test the docking param- 
eters and energy evaluation. Second, the DNA-bound 
UNG coordinates and a linear B-form DNA model 
(B-DNA) were docked to evaluate the fit of B-DNA to 
the optimized active site. Third, B-DNA and the unbound 
human UNG structure (PDB code 1AKZ) (33) were 
docked to evaluate the ability to identify critical features 
of the biological interaction in the absence of the known 
structure of the complex. 

We performed two DXMS experiments: UNG alone 
and UNG bound to a 30 bp dsDNA fragment that 
contained a U:G base pair. Before the hydrogen/ 
deuterium exchange experiment, the DNA and UNG 
were preincubated, resulting in the formation of the 
complex of UNG with product DNA from which the 
uracil base had been cleaved. 

UNG-DNA: docking coordinates from the crystal 
complex 

Docking the abasic DNA product to the DNA-bound 
UNG coordinates reproduced the crystallographic 
UNG-DNA complex 1SSP. The 1SSP UNG construct 
(26) contains the full catalytic domain, but lacks 84 
N-terminal residues of full-length UNG. In the 10 bp 
1SSP DNA, the uracil has been cleaved from the strand 
bound in the UNG active site and there is a 1-nt overhang 
at the 5'-end of the complementary strand. Twenty-seven 
of the 30 top-ranked DNA placements docked close to the 
crystallographic position (rmsd <5 Å; Table 1) and show
an excellent fit to the 4 nt in the UNG active-site groove 
(Supplementary Figure S1A). As in 1SSP, neither the 
DNA major groove, which lies over residues 210-220 on 
the 3' side of the active site (direction based on the
active-site strand), nor the minor groove on the opposite
(5') side of the active site directly contacted UNG. Two
of the 30 top-ranked DNA placements (Table 1 and
Supplementary Figure S1B) bound the complementary
DNA strand in the active-site groove, demonstrating 
that a DNA strand with geometry close to B-DNA and 
with stacked bases can fit into the UNG active site. These 
two placements showed potential new contacts with UNG 
residues 210-220 on the 3' side of the active site. The final 
placement in the top 30 overlapped the 3' side of the crys- 
tallographic DNA and extended over the surface created 
by residues 248-268 (Supplementary Figure S1B). Thus,
this rigid-body docking unambiguously identified the crys- 
tallographic complex as the dominant cluster, but also 
suggested additional UNG-DNA contacts not present in 
1SSP. 

Docking B-DNA to the DNA-bound UNG structure 

We next investigated if the pre-formed DNA-binding site 
on UNG could accommodate B-DNA. Our 11 bp B-DNA
model matched the sequence of the crystallographic DNA, 
except that the U:A base pair was replaced with a U:G 
mismatch, the best substrate for UNG. Since UNG inter- 
rogates extrahelical bases (23,24), we did not expect 
specific recognition of U within our B-DNA model. 
Therefore, we defined clusters based on shared alignment 
of the DNA axis and phosphate groups, allowing shifts 
along the DNA axis by one or more base pairs. 

Twenty-one of the 30 most favorable B-DNA place- 
ments bound in the active-site groove (Figure 1A) with 
the correct 5' to 3' direction of the active-site strand and 
Leu 272 inserted into the DNA minor groove. Two 
distinct active-site clusters were formed (Table 1). In the 
larger cluster (14 structures), one DNA strand fully 
occupied the active-site groove (Figure 1B). The complementary
strand contacted residues 210-220 on the 3' side
of the UNG active site (Table 1). In the smaller cluster
(seven structures), one DNA strand only partially filled
the active-site groove and showed new contacts with the
Tyr 248 and Lys 251 side chains.

Table 1. Distribution of top 30 DNA placements from computational docking over the UNG surface
a UNG residues within 4.5 Å of docked DNA that are not listed in Parikh et al. (1998) (26) for the structure of the UNG-DNA complex (1SSP).
b The rmsd values were calculated between the heavy (non-hydrogen) atoms of the docked DNA placements and the crystallographic position of the
DNA, given a fixed position for UNG.
c DNA placements lying between the active site and the predicted secondary site, with partial overlap of one or both sites.

Unexpectedly, a secondary DNA-binding site distant 
from the UNG active site was found (green, Figure 1A, 
Table 1). The docked cluster (six structures) showed tight 
alignment of the phosphate backbones and the DNA axes, 
with a maximum spread of 9°, which includes translation 
over the curved UNG surface. Potential DNA-contacting 
residues include the positively charged side chains of 
Lys 259, 286, 293 and 296 and Arg 258 and 260. 
Residues 258-260 are closest to the active site, about 30 Å
from the uracil-binding pocket. The smaller active-site
cluster is pointed towards this secondary DNA-binding 
site. One B-DNA placement (magenta, Figure 1A) 
partially overlapped this active-site cluster and the 
secondary-site cluster, suggesting a continuous DNA- 
binding surface from the active site to the secondary 
DNA-binding site. 

Two structures (orange, Figure 1A) lay near the 
truncated N-terminus of the UNG catalytic domain. 
These would clash with the N-terminal region of full- 
length UNG. In 1SSP, bound DNA contacts the truncated 
N-terminus of a UNG molecule in a neighboring asym- 
metric unit. Thus, computational docking indicated a 
non-physiological UNG-DNA crystallographic inter- 
action. 

The distribution of the 2000 most favorable B-DNA 
placements (Figure 1C) supported a DNA-binding 
surface that extends from the active site to the predicted 
secondary DNA-binding site. The preponderance docked 
at the active site, at the well-defined secondary binding 
site, or between the two sites. 

Docking B-DNA models to the free UNG structure 

Finally, we applied DOT to the typical situation where 
only the isolated structure of the protein is known. 
The most favorable 30 and 2000 B-DNA placements 
showed the same distribution, with the majority docked 
at the active site and others at the secondary 



DNA-binding site or between the two sites (Table 1 and 
Supplementary Figure S2A). The majority at the active 
site showed a reversed 5' to 3' direction for the active-site 
DNA strand, positioning the DNA major groove, rather 
than the minor groove, over Leu 272 (Supplementary 
Figure S2B). Relaxing the shape fit, which can be useful 
for unbound protein-protein dockings (16), did not 
improve results (Supplementary Methods), as we also 
found when docking DNA to transcription factors (9). 

DXMS of free and DNA-bound UNG 

UNG alone and the pre-formed UNG-DNA complex 
were examined by DXMS. DXMS experiments were performed
at low ionic strength (10 mM NaCl), where recombinant
UNG shows high activity (21), good substrate
binding (20,21), a highly efficient search mechanism 
(29,30,49), and efficient uracil excision (29). All of these 
functions become less efficient as the ionic strength is 
raised (29), with no detectable activity at 200 mM NaCl 
(21). The low ionic strength should also shift the equilib- 
rium of the UNG-DNA product interaction to the bound 
state, enhancing our ability to detect changes in solvent 
protection. 

For the UNG-DNA complex, we designed a 30 bp 
dsDNA with the potential to reach from the active site 
to the secondary DNA-binding site predicted by compu- 
tational docking. One end of the DNA matched the 
sequence of the 10 bp fragment in 1SSP, except that the 
U:A pair was replaced by a U:G pair. Twenty base pairs 
were added to the 3'-end of the U-containing strand. 
To form the UNG-DNA complex, a 1:1.6 ratio of 
UNG and the designed U-containing DNA were 
preincubated for 90 min. After this time, uracil cleavage 
was complete, as shown by activity assays, and a 1:1 
complex was formed, as determined by MALS 
('Materials and Methods' section). 

In the DXMS studies, UNG alone and the pre-formed 
UNG-DNA complex were exposed to deuterium for 10 
time points ranging from 10 to 300000 s. The solutions 
were then quenched and subjected to protease digestion. 



Nucleic Acids Research, 2012, Vol. 40, No. 13 6075 



The resulting peptides were separated by HPLC and 
analyzed by electrospray mass spectrometry. For UNG 
alone, 128 overlapping peptides were identified. For the 
UNG-DNA complex, 138 overlapping peptides were 
identified. In both cases, peptides spanned the entire 
UNG sequence. The two data sets shared 108 peptides, 
allowing direct comparisons of the change in deuteration. 
The percentage of deuteration for each peptide was 
determined by comparison with undeuterated and fully 
deuterated UNG samples that were subjected to the 
same quench and proteolysis conditions ('Materials and 
Methods' section). 
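
One common way to express the comparison with undeuterated and fully deuterated controls is a linear normalization of the measured peptide mass between the two control masses. The sketch below assumes that normalization; the exact data processing used in this study is described in the Supplementary Methods and may differ in detail. The masses in the example are hypothetical.

```python
def percent_deuteration(m_obs, m_undeut, m_full):
    """Percent deuteration of a peptide from centroid masses.

    m_obs: mass after on-exchange; m_undeut and m_full: masses of the
    undeuterated and fully deuterated control samples of the same peptide.
    """
    span = m_full - m_undeut
    if span <= 0:
        raise ValueError("fully deuterated mass must exceed undeuterated mass")
    return 100.0 * (m_obs - m_undeut) / span


# Hypothetical masses (Da) for one peptide, not taken from the study.
free = percent_deuteration(1504.0, 1500.0, 1510.0)
bound = percent_deuteration(1502.0, 1500.0, 1510.0)
print(free, bound, bound - free)
```

With this convention, a negative difference between the bound and free values indicates less deuteration, and therefore more solvent protection, in the complex.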

The first two backbone nitrogen atoms of each peptide 
exchange rapidly under the experimental conditions 
following the deuteration step (40), so neither contributes 
to the deuteration count. For example, the mass envelope 
corresponding to peptide 140-157 provides deuteration 
information only for residues 142-157. We use the 
residue range for which there is deuteration information 
in all Figures and Tables, for example residues 142-157, 
rather than the full peptide. 
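
The residue-range convention described above can be written down compactly. In the sketch below, the first two residues of each peptide are dropped and proline residues, which have no backbone amide hydrogen, are excluded from the count of amides that can report exchange. The function names and the example sequence are hypothetical; the sequence is not taken from UNG.

```python
def informative_range(first, last, n_fast=2):
    """Residue range carrying deuteration information for a peptide first-last."""
    return first + n_fast, last


def exchangeable_amides(sequence, n_fast=2):
    """Count backbone amides that can report exchange in a peptide sequence.

    Skips the first n_fast residues and every proline (one-letter code P).
    """
    return sum(1 for aa in sequence[n_fast:] if aa != "P")


print(informative_range(140, 157))  # → (142, 157)

# Hypothetical 13-residue peptide: 11 informative residues, five of them Pro.
toy_peptide = "AGPAPGPAPSPLA"
print(exchangeable_amides(toy_peptide))  # → 6
```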

Peptides with 4-25 amides common to UNG 
(Supplementary Table S1) and the UNG-DNA complex
(Supplementary Table S2) were analyzed to obtain the 
change in deuteration (Supplementary Table S3). In 
these Tables and Figure 2, which shows the 30, 300 
and 10000 s deuteration times, peptides are assigned to 
nine distinct regions following the fragmentation pattern. 
UNG showed no contiguous unstructured regions, which
would be fully deuterated within 10 s (50). In free UNG, 
most peptides showed <40% deuteration at the 30 s 
deuteration time (Figure 2, top), with deuteration grad- 
ually increasing at longer times. At 10 000 s, regions 4 
and 5 showed the least deuteration (10-50%) and regions 
1 and 8 showed the most deuteration (75-100%). 

Examination of non-overlapping peptides at all deutera- 
tion times (Supplementary Figure S3) revealed three 
regions with significant decreases in deuteration: residues 
142-157 (region 3), 210-220 (region 6) and 245-274 
(region 8). Together with active-site residues 160-170 
(region 4), which showed a subtle change in deuteration, 
these regions include all of the DNA- and 
uracil-contacting UNG residues in the UNG-DNA crys- 
tallographic structures (26), except for residues 275 and 
276. In addition, residues 111-131 (region 2) and 277- 
290 (region 9) showed small decreases in deuteration 
upon DNA binding. 

Active site residues 142-157 (region 3) and 160-170 
(region 4) 

Residues 142-158 play a central role in binding the ura- 
cil ring and the catalytic water molecule (26). Residues 
142-157 showed a decrease of 1-3 deuterons in the 
presence of DNA (Figure 3A). Overlapping residues 
142-158 and 145-157 in region 3 showed consistent de- 
creases (Supplementary Table S3). 

Figure 2. Percent deuterium incorporation for peptides after 30
(black), 300 (green) or 10000 s (magenta) is shown for UNG (top)
and the UNG-DNA complex (middle). The change in deuteration
(bottom) is shown for peptides common to both data sets, where a
negative percentage indicates less deuteration in the UNG-DNA
complex. Regions are defined as in Supplementary Figure S3 and
Supplementary Tables S1-S3.

Figure 3. Distribution of UNG regions showing significant solvent protection in the presence of DNA. (A) Peptides on the active-site face of UNG.
Residues 142-157 (purple), 160-170 (blue green) and 210-220 (magenta) are highlighted on the UNG Cα backbone (right) and correspond to the
deuteration profiles (left). Residues 210-220 show the greatest change in solvent protection, but have no contact with the bound DNA product (light
blue phosphate backbone) in the 1SSP crystallographic structure. (B) Peptides that span the region between the active site and the secondary
DNA-binding site predicted by computational docking. Residues 245-248 (green) and residues 251-264 (red) are highlighted on the UNG structure
(right) and correspond to deuteration profiles. Residues 265-274 (orange) are shown on the UNG structure (left), but the deuteration profile is for
residues 258-274, which partially overlap residues 251-264 (red). Residues 251-264 show the greatest change in solvent protection and make up part
of the predicted secondary DNA-binding site (indicated by the docked DNA, green phosphate backbone), but have no DNA contacts in 1SSP.

Residues 160-170 showed a subtle difference in
deuteration in the free and DNA-bound UNG states
that appeared with deuteration times longer than
10000 s (Figure 3A). With five Pro residues within this
segment, at most six amides can exchange. At long
deuteration times, free UNG picked up two additional 
deuterons, while DNA-bound UNG had little change. 
This pattern was consistent with the deuteration profiles 
of residues 160-171, 161-170 and 160-174 in region 4. In 
bound UNG, a hydrogen bond is formed between the Ser 
169 amide and a DNA phosphate group (26). In free 
UNG (33), the Ser 169 amide forms a hydrogen bond 
with a water molecule that fits into a pocket on the 
UNG surface and therefore may exchange slowly with 
bulk solvent. A plausible explanation of the deuteration 
profile for residues 160-170 is that two amides among 
residues 160-162 and 164, which are away from the 
active site, exchange with solvent similarly in free and 
DNA-bound UNG, whereas slowly exchanging residues 
169 and 170 exchange even more slowly in DNA-bound 
UNG. 

Strong protection of residues 210-220 (region 6) and 
251-264 (region 8) in the presence of DNA 

The striking decrease in deuteration in the presence of 
DNA found for residues 210-220 (Figure 3A) was also
seen for residues 210-222 and 210-224 (Figure 2, 
Supplementary Table S3). This strong protection is incon- 
sistent with the UNG-DNA crystallographic structures, in 
which the DNA major groove lies over residues 213-219, 
but the molecular surfaces are separated by at least 7 Å
as a result of packing interactions in the crystal 
(see 'Discussion' section). Residues 114-126 (region 2), 
of which residues 116-119 contact residues 210-220, 
were responsible for the decrease of 1-2 deuterons seen 
for residues 111-131 in the presence of DNA. 

Analysis of overlapping peptides within residues 245-274 
revealed that the region farthest from the active site, 
residues 251-264, was primarily responsible for the 
observed decrease in deuteration upon DNA binding 
(Figure 3B). As observed for residues 210-220, residues 
251-264 showed a striking decrease in deuteration in the 
presence of DNA, with one deuteron incorporated.
Residues 251-264 include the solvent-exposed loop
258-262 (sequence Arg-Lys-Arg-His-His), which forms 
part of the secondary DNA-binding site predicted by com- 
putational docking. The decrease of 1-2 deuterons in the 
presence of DNA for residues 277-290 can be explained by 
the decrease seen for residues 277-280. These residues 
contact the β-strand formed by residues 262-267 and immediately
follow residues Tyr 275 and Arg 276, which contact
the DNA minor groove on one side of the active site. 

Despite the DNA contacts with UNG seen in 1SSP, the 
beginning and ending segments of residues 245-274 
showed smaller changes in their deuteration profiles in 
the presence of DNA. Residues 258-274 had a decrease 
of 6-7 deuterons at deuteration times of 300 s or more, 
but due to overlap with residues 251-264, at most three 
amides within 265-274 could be protected in the presence 
of DNA. This increased solvent protection is likely due 
to DNA contacts with loop 268-274 (5 amides, 2 Pro), 
including insertion of Leu 272 into the DNA base stack 
and hydrogen bonding of the backbone amide of residue 
268 with a DNA phosphate group (26). 
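
The amide bookkeeping used in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the analysis software used in this study: the function name is hypothetical, and the only assumption is that every non-proline residue inside a segment contributes one exchangeable backbone amide hydrogen.

```python
def exchangeable_amides(first, last, n_pro):
    """Count backbone amide hydrogens in residues first..last (inclusive).

    Assumption: the segment lies inside a longer peptide, so every residue
    except proline carries one exchangeable backbone amide hydrogen.
    """
    n_res = last - first + 1
    if n_res < 1:
        raise ValueError("last must not be smaller than first")
    if not 0 <= n_pro <= n_res:
        raise ValueError("proline count must lie between 0 and segment length")
    return n_res - n_pro


# Loop 268-274 as described in the text: 7 residues, 2 of them proline.
print(exchangeable_amides(268, 274, n_pro=2))  # 5
```

The count is an upper limit on the deuterons a segment can report. For an isolated peptide the N-terminal residue has no backbone amide, so one further position would have to be subtracted.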

Residues 245-248 showed a decrease in deuteration 
at short and long deuteration times (Figure 3B). In 
bound UNG (26), the amide of residue 247 forms a 
hydrogen bond with a bound DNA phosphate oxygen 
atom. Residues 245 and 246 are the most buried of the 
four residues, with both amides hydrogen-bonded to other 
residues, while residues 247 and 248 are on the UNG 
surface. A plausible explanation for the change in 
deuteration profile is that partial burial of amide 247 by 
DNA slows the exchange of both amide 247 and the two 
buried amides. 



DISCUSSION 

The combination of hydrogen/deuterium exchange data 
and computational modeling has proven useful for 
constructing models of amyloid peptide oligomerization 
(51-55) and the assembly of pilin proteins into bacterial 
filaments (56). Here, we have demonstrated that the 
combination of computational docking and DXMS is a 
powerful tool for revealing the DNA footprint on a 
protein. Our initial goal was to test the ability of this 
approach to distinguish the UNG active site. Our results 
went far beyond this: we found a significantly larger 
DNA-binding surface on UNG than seen in crystallo- 
graphic structures. Computational docking guided our 
choice of a 30 bp dsDNA fragment for our DXMS 
study, considerably longer than the 10 bp DNA used in 
crystallographic studies. DXMS supported the computa- 
tional docking results, showing interactions of the abasic 
DNA product with two distinct regions on the 3' side of 
the UNG active site. 

Identification of the UNG active site 

Computational docking of the unbound structure of UNG 
and a B-DNA model found the largest concentration of 
favorable-energy DNA placements at the UNG active site, 
identifying the active-site groove and loop residues 
268-276 as important DNA-contacting regions. Essential 
Leu 272 stands out as a surprisingly hydrophobic side 
chain with direct contact to DNA (Supplementary 
Figure S2B). Computational docking was less successful 
in identifying the DNA surface that contacts the UNG 
active site. Reversal of the active-site strand packed the 
DNA major groove, rather than the minor groove, against 
the Leu 272 side chain. Strand direction reversal was also 
seen in DNA dockings to the transcription factor FadR 
(9), which, like UNG, causes a widening of the minor 
groove of the bound DNA. Therefore, the switching of 
minor and major groove surfaces needs to be considered 
when interpreting the docking of DNA models to 
unbound protein structures. 

DXMS supported the computational docking but, by 
itself, did not definitively identify the active site. Peptides 
away from the immediate active site showed more 
pronounced solvent protection in the presence of DNA 
than peptides forming the immediate active site. This is 
a consequence of the fragmentation pattern. Active-site 
peptides included unprotected segments outside the 
active site, but two peptides outside the active site 
created continuous surfaces that were almost completely 
protected from solvent in the presence of DNA. 
Computational rigid-body docking provided essential 
structural interpretation of the DXMS data by unambigu- 
ously identifying the active site. 

Two DNA-binding regions on the 3' side of the UNG 
active site 

Computational docking and the dramatic change in 
solvent protection found by DXMS support two distinct 
DNA-contact surfaces, created by residues 210-220 and 
251-264, on the 3' side of the UNG active site. Although 
DNA binding can indirectly increase solvent protection of 
amide protons by formation of large assemblies (39), con- 
formational change (50), or stabilization of unstructured 
regions (57,58), direct DNA contact is the most plausible 
mechanism for UNG. UNG is a single-domain protein 
that forms a 1:1 complex with the 30 bp DNA product. 



Large conformational changes in UNG are unlikely, given 
the strong conservation of the UNG backbone found in 
crystallographic structures of human (26,31,33,59), 
Escherichia coli (60) and Herpes simplex virus UNG 
(61), alone or in complex with DNA or the inhibitor 
protein UGI. Active-site residues of UNG show induced 
dynamics upon binding to undamaged DNA (25) that 
may reflect the clamping movement that occurs upon 
binding both non-target (24) and target DNA (26). In 
this movement, the two lobes on either side of the active 
site, which include residues 210-220 and 251-264, close 
down to narrow the active-site groove by about 2 Å. 
These concerted movements do not alter residue inter- 
actions, secondary structure, or hydrogen-bonding 
patterns within each lobe. DXMS (Figure 2) demonstrates 
that the catalytic domain of UNG contains no unstruc- 
tured regions, ruling out an unstructured to structured 
transition. In free UNG, the only solvent-exposed 
peptides that show strong solvent protection encom- 
pass well-defined helices (105-108, 173-177, 225-235, 
281-290). All other strongly protected regions are buried 
and include β-strands (136-139, 197-203). The structures 
of residues 210-220 and 251-264 contain a variety of 
amide proton environments (Supplementary Table S4), 
including amide protons with no hydrogen bonds. These 
irregular, solvent-exposed structures are unlikely to have 
the capability to form hydrogen bonds sufficiently strong 
to cause the protection seen at long deuteration times. 

Docking B-DNA to the DNA-bound UNG structure 
demonstrated how DNA can extend from the active site 
over residues 210-220 and 251-264. In the larger 
active-site cluster, DNA bound in the UNG active site 
and contacted residues 210-220 with just a small change 
in orientation relative to the crystallographic DNA. 
Potential DNA contacts include main-chain atoms, the 
side chains of residues Gln 213, Asn 215, Lys 218 and 
Glu 219, and the Arg 210 and Arg 220 guanidinium 
groups, which are spaced to interact with two adjacent 
phosphate groups (Supplementary Figure S4A). The 
smaller active-site cluster contacted residues 248 and 251 
in an orientation appropriate for extending over residues 
251-264, with additional DNA placements supporting a 
continuous DNA-binding region that extends to the pre- 
dicted secondary DNA-binding site (Supplementary 
Figure S4B). Potential DNA contacts include main-chain 
atoms of residues 251-264, and side chains of Lys 251, Ser 
254, residues 258-260 (sequence Arg-Lys-Arg, part of the 
secondary DNA-binding site), His 262 and Leu 264, which 
together form a surface groove between the active and 
secondary sites. DNA contacts at the secondary site 
include the positively charged side chains of residues 
258-260 and Lys 286, 293 and 296 (Supplementary 
Figure S4C). The one studied mutation within residues 
251-264, replacement of His 261 by Leu, shows a 71% 
drop in activity (33), which is significant, given that His 
261 is about 30 Å from the active site. The His 261 side 
chain extends into the UNG interior (Supplementary 
Figure S4B), so the mutation acts indirectly, perhaps 
through changes in adjacent His 262 or solvent-exposed 
loop 258-260. 
In contrast with the docking and DXMS results, three 
human UNG-DNA structures [1SSP (26), 4SKN (31) and 
2OYT (24)] show no DNA contact with residues 210-220. 
A fourth structure, 2OXM (24), has a single contact of 
the His 212 side chain with an extrahelical thymine. 
However, all four structures suggest the potential for 
DNA interaction with residues 210-220. The DNA 
major groove lies over the surface created by these 
residues, but a 6-10 Å layer of water molecules separates 
the protein and DNA surfaces. This water layer is con- 
nected to bulk solvent, so from the crystallographic struc- 
tures, we would predict that surface residues 210-220 
would show significant deuteration in the presence of the 
bound DNA product. 

Our examination of the full crystal environment 
resolved the apparent inconsistency between our DXMS 
results on residues 210-220 and the four human UNG- 
DNA crystallographic complexes, all of which have a 
single UNG-DNA pair in the asymmetric unit. In all 
cases a neighboring UNG molecule creates a wedge 
between UNG and its bound DNA (Supplementary 
Figure S5A). This neighboring UNG molecule also 
forms a pocket near its N-terminus that completely envelops 
the exposed surface of the 5' overhanging adenine of 
the complementary DNA strand, putting strong con- 
straints on the DNA geometry. The result is virtually iden- 
tical DNA structures on the 3' side of the active site 
(Supplementary Figure S5B). These interactions may be 
required for successful crystallization of the UNG-DNA 
complex, but they are clearly not physiological, since they 
involve an end of the DNA fragment and the truncated 
N-terminus of UNG, both artifacts. Crystal packing 
may also influence the DNA geometry on the 5' side of 
the active site, where both the DNA geometry and type 
of crystal contacts vary among the four structures 
(Supplementary Figure S5B). The strong influence of 
crystal packing raises doubts that the DNA geometry 
away from the immediate active site reflects the biological 
interaction, particularly given the DXMS results for 
residues 210-220. 

The four UNG-DNA crystallographic structures show 
no evidence for interaction of the bound product DNA 
with residues 251-264. With the abasic site bound at the 
active site, the 10 bp crystallographic DNA fragment is 
not long enough to reach this region. 
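
The length argument can be checked with a rough estimate. The sketch below is illustrative only. It assumes ideal, straight B-DNA with a rise of about 3.4 Å per base pair (a textbook value, not a measurement from this study), and the lesion position passed to `arm_length` is a hypothetical parameter, not the position in the crystallographic fragments.

```python
B_DNA_RISE = 3.4  # Å per base pair; textbook value for ideal B-DNA (assumption)


def duplex_length(n_bp, rise=B_DNA_RISE):
    """Approximate end-to-end length (Å) of a straight duplex of n_bp base pairs."""
    if n_bp < 0:
        raise ValueError("n_bp must be non-negative")
    return n_bp * rise


def arm_length(n_bp, lesion_index, rise=B_DNA_RISE):
    """Approximate duplex length (Å) beyond the lesion toward one end.

    lesion_index is the 1-based position of the lesion along the strand;
    it is a hypothetical input chosen by the caller.
    """
    if not 1 <= lesion_index <= n_bp:
        raise ValueError("lesion_index must lie within the duplex")
    return (n_bp - lesion_index) * rise


print(f"10 bp duplex: {duplex_length(10):.0f} Å")  # 34 Å
print(f"30 bp duplex: {duplex_length(30):.0f} Å")  # 102 Å
print(f"arm beyond a mid-fragment lesion in 10 bp: {arm_length(10, 5):.0f} Å")  # 17 Å
```

Under these assumptions, a lesion near the middle of a 10 bp fragment leaves an arm of less than 20 Å on either side, which is short compared with the roughly 30 Å separating His 261 from the active site. The same estimate for a 30 bp fragment leaves ample length to span both surfaces.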

An expanded picture of DNA binding 

With little change in UNG structure, it is difficult to 
envision how the bound dsDNA product can maintain 
Watson-Crick base pairing and simultaneously contact 
the two surfaces created by residues 210-220 and 
251-264. The simplest explanation for strong protection 
of both surfaces is separation of the two DNA strands on 
the 3' side of the UNG active site. 

Figure 4. Model of the 30-bp product dsDNA bound to UNG. 
Both DNA strands in the model align with the crystallographic DNA 
(gray phosphate backbone) on the 5' side of the active site. On the 3' 
side of the active site, the active-site strand (light blue) contacts the 
groove created by residues 251-274, including the continuous surface 
formed by main-chain atoms of residues 251-264 (red) and 265-274 
(orange). The complementary strand (blue) contacts the groove 
created by residues 210-220, including the surface created by the 
main-chain atoms (magenta), and may also contact the surface 
created by main-chain atoms of residues 251-258 (red). 

In our proposed model (Figure 4), the DNA is bound 
with the abasic site in the UNG active site, matching the 
position of the crystallographic 10 bp DNA on the 5' side 
of the active site and in the active site. On the 3' side, 
the active-site strand (light blue) extends from the active 
site toward the predicted secondary site, contacting the 
shallow surface groove created by residues 251-274. 
The complementary strand (blue), with a small shift 
from its crystallographic position, contacts the shallow 
groove created by residues 210-220 and then extends 
over residues 251-258 to meet the active-site strand. 
Interestingly, the surfaces at the bottom of both shallow 
grooves are created by main-chain atoms: the exposed 
edge strand (residues 262-266) of the central β-sheet 
(Figure 4, red and orange) and the interlocking β-turn 
structure of residues 210-220 (Figure 4, magenta). 
Main-chain atoms frequently form hydrogen bonds with 
DNA phosphate groups (62) and here may provide an 
organized hydrogen-bonding platform for non-sequence-specific 
DNA interactions. 

The extensive DNA-binding surfaces on the 3' side 
of the active site may have important functional 
roles. Extensive weak interactions with product DNA 
could contribute incrementally to the strong affinity of 
the product [KD = 6 nM (26)], and compensate for the 
energy needed for UNG-induced strand separation. In 
1SSP, the distortions of the active-site strand around the 
abasic site appear hidden by the overlying complementary 
strand (Supplementary Figure S5A). Strand separation of 
the dsDNA product on the 3' side of the active site could 
help expose the conformational changes of the processed 
active-site strand, providing a mechanism for recognition 
of the UNG-bound dsDNA product by APE-1, the next 
enzyme in the base-excision repair pathway. These 
DNA-binding surfaces may also assist the local search 
for uracil. NMR studies find that UNG has a passive 
role in dsDNA base pair opening, but substantially in- 
creases the lifetime of an extrahelical base (23,63). Our 
predicted DNA-binding surfaces provide a large region 
for trapping a spontaneously opened base during loose 
association of UNG and dsDNA, initiating the local 
search for uracil. With this mechanism, UNG takes 
advantage of the faster spontaneous opening rates of A:T, 
A:U and G:U base pairs relative to G:C, focusing the local 
search on dsDNA regions most likely to contain U. 
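
To put the product affinity quoted above on an energy scale, the dissociation constant can be converted to a standard binding free energy with ΔG° = RT ln(KD). The sketch below is a worked example, not part of the original analysis. The temperature of 298.15 K and the 1 M standard state are assumptions, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = R*T*ln(Kd), with Kd expressed in mol/L (1 M standard state).
    The default temperature is an assumption, not a value from the study.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temp_k * math.log(kd_molar)


dg = binding_free_energy(6e-9)             # Kd = 6 nM, as quoted in the text
per_decade = R_KCAL * 298.15 * math.log(10)

print(f"dG for Kd = 6 nM: {dg:.1f} kcal/mol")                    # -11.2 kcal/mol
print(f"free energy per 10-fold change in Kd: {per_decade:.2f} kcal/mol")  # 1.36 kcal/mol
```

On this scale, each 10-fold change in KD corresponds to only about 1.4 kcal/mol, so a modest number of weak contacts spread over an extended surface could account for a sizeable share of the total affinity.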

Our results bring up the question of whether strand 
separation might apply to dsDNA substrate as well as 
product. Strand separation or partial melting of dsDNA 
substrate induced by UNG has been suggested as a 
binding mechanism (21,64) based on the sequence- 
dependent rate of uracil removal in dsDNA and the lack 
of sequence dependence in ssDNA (21,22). Strand separ- 
ation allows UNG to use the same mechanism for local 
uracil search in both dsDNA and ssDNA substrates (64), 
explaining their similar rates of uracil cleavage [~3-fold 
faster in ssDNA (21)]. On the other hand, NMR studies 
find no evidence for strand separation for an undamaged 
10-bp DNA fragment (63). Furthermore, a high local GC 
content generally causes a slower rate of uracil cleavage 
(21), so dsDNA flexibility may be a major source of the 
sequence-dependent effect (65). However, some substrates 
with a low GC content show reduced uracil cleavage rates 
(21), so DNA flexibility alone does not completely explain 
the observed sequence specificity. 

The combination of DXMS and computational 
docking has revealed new aspects of the well-characterized 
interaction of UNG and DNA. Computational docking 
provided the initial evidence for a more extensive 
DNA-binding surface than seen in crystallographic struc- 
tures. DXMS, as the experimental technique to test this 
hypothesis, has two key advantages over X-ray crystallog- 
raphy: it uses a solution environment and puts no con- 
straints on the length of the DNA. The results found by 
DXMS and computational docking for active-site 
peptides are consistent with the crystallographic structures 
of UNG and other base-repair enzymes (66,67), but 
extend this information in important ways. The two 
DNA-binding surfaces adjacent to the active site found 
by DXMS cannot be identified in the UNG-DNA crys- 
tallographic structures because of the influence of crystal 
packing and the short, 10 bp DNA. The overall UNG 
mechanism requires capabilities beyond the catalytic 
reaction — the impressive detection of uracil amid vast 
numbers of undamaged bases and the delivery of 
the toxic abasic product to the apurinic/apyrimidinic 
endonuclease, APE-1. Although our methods are lower 
resolution than the crystallographic structures, they have 
defined new DNA-binding regions at the UNG residue 
level that may be essential for these critical functions. 